Predicting Future Instance Segmentations by Forecasting Convolutional Features

نویسندگان

Pauline Luc

Camille Couprie

Yann LeCun

Jakob Verbeek

چکیده

Anticipating future events is an important prerequisite towards intelligent behavior. Video forecasting has been studied as a proxy task towards this goal. Recent work has shown that to predict semantic segmentation of future frames, forecasting at the semantic level is more effective than forecasting RGB frames and then segmenting these. In this paper we consider the more challenging problem of future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of fixed-sized convolutional features of the Mask R-CNN instance segmentation model. We apply the “detection head” of Mask R-CNN on the predicted features to produce the instance segmentation of future frames. Experiments show that this approach significantly improves over baselines based on optical flow.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Forecasting Hand and Object Locations in Future Frames

This paper presents an approach to forecast future locations of human hands and objects. Given an image frame, the goal is to predict presence and location of hands and objects in the future frame (e.g., 5 seconds later), even when they are not visible in the current frame. The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts scene inform...

متن کامل

Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework

Deep learning approaches have reached a celebrity status in artificial intelligence field, its success have mostly relied on Convolutional Networks (CNN) and Recurrent Networks. By exploiting fundamental spatial properties of images and videos, the CNN always achieves dominant performance on visual tasks. And the Recurrent Networks (RNN) especially long short-term memory methods (LSTM) can succ...

متن کامل

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model. The proposed PersonLab model tackles both semantic-level reasoning and object-part associations using part-based modeling. Our model employs a convolutional network which learns to detect individual keypoints and predict their...

متن کامل

Semantic Segmentation with Deep Learning

We present a deep convolutional neural network approach for producing semantic segmentations. First, we generalize the architecture of the successful Alexnet network [7] to directly predict coarse segmentations. Second, we produce full resolution segmentations by re-ranking a diverse set of plausible segmentation proposals generated from a recent state of the art approach [9].

متن کامل

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2018

Predicting Future Instance Segmentations by Forecasting Convolutional Features

نویسندگان

چکیده

منابع مشابه

Forecasting Hand and Object Locations in Future Frames

Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

Semantic Segmentation with Deep Learning

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

عنوان ژورنال:

اشتراک گذاری